📄 Text Segmentation

Boundary Detection, Sentence Splitting, Document Structure, Natural Language Processing

Text2Struct: A Machine Learning Pipeline for Mining Structured Data from Text
arxiv.org·1d
🔤Character Classification
Clustering News Articles for Topic Detection: A Technical Deep Dive
dev.to·3d·Discuss: DEV
📚Document Clustering
A Deep Convolutional Neural Network-Based Novel Class Balancing for Imbalance Data Segmentation
arxiv.org·1d
🧠Machine Learning
davidchisnall/igk: I got Knuth'd: A compiler for documents
github.com·7h
📝Concrete Syntax
What LLMs Know About Their Users
schneier.com·2h
💻Local LLMs
The modern text processing pipeline: Overview
newroadoldway.com·1d·Discuss: Lobsters, r/programming
🔤Unicode Normalization
Why Your Chunking Strategy Makes or Breaks Your AI System
medium.com·4d·Discuss: Hacker News
📄Text Chunking
Multimodal Political Bias Identification and Neutralization
arxiv.org·1d
🤖Advanced OCR
Detecting Machine-Generated Texts: Not Just "AI vs Humans" and Explainability is Complicated
arxiv.org·10h
🧮Kolmogorov Complexity
Why Your Next LLM Might Not Have A Tokenizer
towardsdatascience.com·18h
🤖Grammar Induction
Markov-Enhanced Clustering for Long Document Summarization: Tackling the 'Lost in the Middle' Challenge with Large Language Models
arxiv.org·1d
📄Text Chunking
StoryGem: Voronoi treemap Approach for Semantics-Preserving Text Visualization
arxiv.org·1d
🔶Voronoi Diagrams
Semantic-Aware Parsing for Security Logs
arxiv.org·1d
📝Log Parsing
Machine Learning Fundamentals: active learning
dev.to·21h·Discuss: DEV
🤖Grammar Induction
QuranMorph: Morphologically Annotated Quranic Corpus
arxiv.org·1d
📋Document Grammar
MemeMind: A Large-Scale Multimodal Dataset with Chain-of-Thought Reasoning for Harmful Meme Detection
arxiv.org·10h
🧮Vector Embeddings
Practical tips to optimize documentation for LLMs, AI agents, and chatbots
biel.ai·18h·Discuss: Hacker News
🤖Archive Automation
Beyond the Link: Assessing LLMs' ability to Classify Political Content across Global Media
arxiv.org·1d
📄Text Chunking
AI’s ‘Neutral Voice’ Is a Structural Illusion
hackernoon.com·7h
🤖Grammar Induction
Computational Approaches to Understanding Large Language Model Impact on Writing and Information Ecosystems
arxiv.org·1d
📜Digital Philology